IDSIA-19-97, April 19, 1997; revised August 21, 1998

Centering Neural Network Gradient Factors

Author

  • Nicol N. Schraudolph
Abstract

It has long been known that neural networks can learn faster when their input and hidden unit activities are centered about zero; recently we have extended this approach to also encompass the centering of error signals [2]. Here we generalize this notion to all factors involved in the network’s gradient, leading us to propose centering the slope of hidden unit activation functions as well. Slope centering removes the linear component of backpropagated error; this improves credit assignment in networks with shortcut connections. Benchmark results show that this can speed up learning significantly without adversely affecting the trained network’s generalization ability.
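To make the three kinds of centering concrete, the following is a minimal NumPy sketch for a one-hidden-layer tanh network. The layer sizes, variable names, and the use of per-batch means are illustrative assumptions, and the bias and shortcut weights that absorb the removed means in the actual method are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-hidden-layer tanh network (sizes chosen arbitrarily).
n_in, n_hid, n_out, batch = 4, 8, 2, 32
W1 = rng.normal(scale=0.1, size=(n_in, n_hid))
W2 = rng.normal(scale=0.1, size=(n_hid, n_out))

x = rng.normal(size=(batch, n_in))
t = rng.normal(size=(batch, n_out))  # dummy targets

# Input centering: subtract the batch mean so inputs are zero-mean.
x_c = x - x.mean(axis=0)

# Forward pass with activity centering on the hidden layer.
h = np.tanh(x_c @ W1)
h_c = h - h.mean(axis=0)             # zero-mean hidden activities

y = h_c @ W2
e = y - t                            # output error under squared loss

# Error centering: subtract the mean error signal.
e_c = e - e.mean(axis=0)

# Slope centering: subtract the mean activation slope; this removes
# the linear component of the backpropagated error.
slope = 1.0 - h ** 2                 # tanh'(net) expressed via tanh(net)
slope_c = slope - slope.mean(axis=0)

# Gradients of the centered network's weights.
delta_h = (e_c @ W2.T) * slope_c
grad_W1 = x_c.T @ delta_h / batch
grad_W2 = h_c.T @ e_c / batch
```

This sketch only shows where each centering operation enters the forward and backward pass; in the method itself the subtracted means are reintroduced through bias and shortcut weights so that the network's overall mapping is preserved.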


Related articles

Two-tier symmetry-breaking model of patterns on a catalytic surface

L. M. Pismen, R. Imbihl, B. Y. Rubinstein, and M. I. Monin. Department of Chemical Engineering, Technion, Technion City, 32000 Haifa, Israel; Minerva Center for Nonlinear Physics of Complex Systems, Technion, Technion City, 32000 Haifa, Israel; Institut für Physikalische Chemie und Elektrochemie, Universität Hannover, D-30167 Hannover, Germany. (Received 19 August 1997; revised manuscript received ...)


Vesicle electrohydrodynamic simulations by coupling immersed boundary and immersed interface method

Article history: Received 21 August 2015; received in revised form 29 February 2016; accepted 16 April 2016; available online 19 April 2016.


Slope Centering: Making Shortcut Weights Effective

Shortcut connections are a popular architectural feature of multi-layer perceptrons. It is generally assumed that by implementing a linear submapping, shortcuts assist the learning process in the remainder of the network. Here we find that this is not always the case: shortcut weights may also act as distractors that slow down convergence and can lead to inferior solutions. This problem can be ...


Accelerated Gradient Descent by Factor-Centering Decomposition

Gradient factor centering is a new methodology for decomposing neural networks into biased and centered subnets which are then trained in parallel. The decomposition can be applied to any pattern-dependent factor in the network’s gradient, and is designed such that the subnets are more amenable to optimization by gradient descent than the original network: biased subnets because of their simpli...
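As a rough sketch of the factor split described above (the subnet construction itself is in the paper; the variable names here are my own), any pattern-dependent gradient factor decomposes exactly into a pattern-independent mean plus zero-mean deviations:

```python
import numpy as np

# A pattern-dependent gradient factor (e.g. an activity or a slope),
# one value per pattern in the batch:
a = np.array([0.9, 0.1, 0.4, 0.6])

a_bar = a.mean()         # "biased" part: the pattern-independent mean
a_centered = a - a_bar   # "centered" part: zero-mean deviations

# The decomposition is exact: a == a_bar + a_centered.
assert np.allclose(a, a_bar + a_centered)
```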





Journal title:

Volume:   Issue:

Pages:  -

Publication date: 1998